Hierarchical Multi-scale Attention Networks for action recognition
نویسندگان
چکیده
منابع مشابه
Hierarchical Multi-scale Attention Networks for action recognition
Recurrent Neural Networks (RNNs) have been widely used in natural language processing and computer vision. Among them, the Hierarchical Multi-scale RNN (HM-RNN), a kind of multi-scale hierarchical RNN proposed recently, can learn the hierarchical temporal structure from data automatically. In this paper, we extend the work to solve the computer vision task of action recognition. However, in seq...
متن کاملHierarchical Attention Network for Action Recognition in Videos
Understanding human actions in wild videos is an important task with a broad range of applications. In this paper we propose a novel approach named Hierarchical Attention Network (HAN), which enables to incorporate static spatial information, short-term motion information and long-term video temporal structures for complex human action understanding. Compared to recent convolutional neural netw...
متن کاملMulti-Scale Action Recognition in Squash Match
Algorithms for human action recognition usually observe human motion only on particular level of detail. This approach requires complex algorithms to match the complexity of motion. High recognition rates are possible, when actions are distinct and clearly visible. However, this is not the case in many practical applications. To solve this we explore the possibility of developing more general a...
متن کاملAction Recognition with Joint Attention on Multi-Level Deep Features
We propose a novel deep supervised neural network for the task of action recognition in videos, which implicitly takes advantage of visual tracking and shares the robustness of both deep Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN). In our method, a multi-branch model is proposed to suppress noise from background jitters. Specifically, we firstly extract multi-level dee...
متن کاملMulti-scale image semantic recognition with hierarchical visual vocabulary
Local features have been proved to be effective in image/video semantic analysis. The BOVW (bag of visual words) scheme can cluster local features to form the visual vocabulary which includes an amount of words, where each word is the center of one clustering feature. The vocabulary is used to recognize the image semantic. In this paper, a new scheme to construct semantic-binding hierarchical v...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Signal Processing: Image Communication
سال: 2018
ISSN: 0923-5965
DOI: 10.1016/j.image.2017.11.005